Effective Visualizations

Now that you know how to create graphics and visualizations in R, you are armed with powerful tools for scientific computing and analysis. With this power also comes great responsibility. Effective visualization is an incredibly important aspect of scientific research and communication. Several books (see Resources below) have been written about these principles. In class today we will go through several case studies to develop some expertise in making effective visualizations.

Worksheet

The worksheet questions for today are embedded into the class notes.

You can download this Rmd file here

Note, there will be very little coding in-class today, but I’ve given you plenty of exercises in the form of a supplemental worksheet (linked at the bottom of this page) to practice with after class is over.

Resources

  1. Fundamentals of Data Visualization by Claus Wilke.

  2. Visualization Analysis and Design by Tamara Munzner.

  3. STAT545.com - Effective Graphics by Jenny Bryan.

  4. ggplot2 book by Hadley Wickham.

  5. Callingbull.org by Carl T. Bergstrom and Jevin West.

Part 1: Warm-up and pre-test [20 mins]

Warmup:

Write some notes here about what “effective visualizations” means to you. Think of elements of good graphics and plots that you have seen - what makes them good or bad? Write 3-5 points.

  1. Plots are visually appealing (colours, graphs) in a way that makes them easy to read
  2. Informative labels (axis, title, legend, captions)
  3. Correct plot used for the analysis (e.g. scatter plot for two quantitative variables, density plot for spread)
  4. Scales make sense (e.g. a log scale for GDP)
  5. Categories are divided in the plots in a way that makes sense
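
As a small illustration of the point about scales, here is a sketch that draws the same data with a linear and a log10 x-axis. It assumes the gapminder package (used later in these notes) is installed; the object names `p_linear` and `p_log` are just for this example.

```r
library(ggplot2)
library(gapminder)

# Linear x-axis: most countries are squeezed against the left edge
p_linear <- ggplot(gapminder, aes(x = gdpPercap, y = lifeExp)) +
    geom_point(alpha = 0.3) +
    labs(x = "GDP per capita", y = "Life expectancy (years)",
         title = "Linear x-axis")

# Same plot with a log10 x-axis: the points spread out across the panel
p_log <- p_linear +
    scale_x_log10() +
    labs(title = "Log10 x-axis")

p_linear
p_log
```

Printing the two plots one after the other makes it easier to judge which scale suits the data.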

CQ01: Weekly hours for full-time employees

Question: Evaluate the strength of the claim based on the data: “German workers are more motivated and work more hours than workers in other EU nations.”

Very strong, strong, weak, very weak, do not know - << Weak, because working more hours does not mean more productive work (we would have to look at productivity in the different countries) or more motivation. There is also no way to tell whether the differences in the data are significant. >>

  • Main takeaway: Include error bars to show whether differences are significant, and make sure the axis starts at 0
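
A minimal sketch of this takeaway, using the built-in `ToothGrowth` dataset as a stand-in (it is not the working-hours data from the question). The error bars here are assumed to be plus or minus one standard error; other choices, such as confidence intervals, are equally possible.

```r
library(dplyr)
library(ggplot2)

# Stand-in data: mean tooth length and its standard error per supplement type
summ <- ToothGrowth %>%
    group_by(supp) %>%
    summarise(mean_len = mean(len),
              se_len = sd(len) / sqrt(n()))

# geom_col() draws bars from 0, so the y-axis baseline is not truncated
p_err <- ggplot(summ, aes(x = supp, y = mean_len)) +
    geom_col() +
    geom_errorbar(aes(ymin = mean_len - se_len, ymax = mean_len + se_len),
                  width = 0.2) +
    labs(x = "Supplement type", y = "Mean tooth length (+/- 1 SE)")

p_err
```

Error bars give a visual sense of uncertainty, but they do not replace a formal statistical test.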

CQ02: Average Global Temperature by year

Question: For the years this temperature data is displayed, is there an appreciable increase in temperature?

Yes, No, Do not know - <>

  • Main takeaway: A large y-axis range makes changes look insignificant, the x-axis is not labelled, and a 2 °C change is actually significant
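
To see how much the y-axis range matters, here is a sketch using the built-in `nhtemp` series (mean annual temperature in New Haven, in degrees Fahrenheit), which is only a stand-in for the global temperature data in the question. The 0 to 100 range is an arbitrary choice made for illustration.

```r
library(ggplot2)

# Convert the built-in time series to a data frame
temps <- data.frame(year = as.numeric(time(nhtemp)),
                    temp_f = as.numeric(nhtemp))

# Default y-axis range: year-to-year variation is visible
p_default <- ggplot(temps, aes(x = year, y = temp_f)) +
    geom_line() +
    labs(x = "Year", y = "Mean annual temperature (degrees F)")

# Very wide y-axis range: the same data now look almost flat
p_wide <- p_default +
    coord_cartesian(ylim = c(0, 100))

p_default
p_wide
```

`coord_cartesian()` zooms the view without dropping any data, so both plots show exactly the same observations.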

CQ03: Gun deaths in Florida

Question: Evaluate the strength of the claim based on the data: “Soon after this legislation was passed, gun deaths sharply declined.”

Very strong, strong, weak, very weak, do not know - <>

  • Main takeaway: Because of the way the axis is presented, the chart appears to show the opposite effect
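
One way an axis can produce the opposite impression is by being reversed. The sketch below assumes that is the issue in this chart, and uses the `economics` dataset that ships with ggplot2 as a stand-in for the gun deaths data.

```r
library(ggplot2)

# Conventional orientation: larger values are higher on the plot
p_up <- ggplot(economics, aes(x = date, y = unemploy)) +
    geom_line() +
    labs(x = "Date", y = "Unemployed (thousands)")

# Reversed y-axis: increases now look like declines
p_flipped <- p_up +
    scale_y_reverse()

p_up
p_flipped
```

A reader who does not check the axis labels on the second plot would likely read every rise as a fall.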

Part 2: Extracting insight from visualizations [20 mins]

A great resource for selecting the right plot is https://www.data-to-viz.com/. I encourage you all to consult it when choosing how to visualize data.

Case Study 1: Context matters

The y-axes on the two sides differ (autism vs. MMR coverage), and correlation does not equal causation.

Case Study 2: A case for pie charts

Mentioned that pie charts are not a good idea.

Part 3: Principles of effective visualizations [20 mins]

We will be filling these principles in together as a class

Make a great plot worse

Instructions: Below is a code chunk that shows an effective visualization. First, copy this code chunk into a new cell. Then, modify it to purposely make this chart “bad” by breaking the principles of effective visualization above. Your final chart still needs to run/compile and it should still produce a plot.

library("tidyverse")
library("datasets") # ships with base R, so no installation is needed
ggplot(airquality, aes(`Month`, `Temp`, group = `Month`)) +
    geom_boxplot(outlier.shape = NA) +
    geom_jitter(alpha = 0.2) +
    xlab("Month of year") + 
    ylab("Maximum Temperature") + 
    theme_bw()

How many of the principles did you manage to break?

Plotly demo [10 mins]

Did you know that you can make interactive graphs and plots in R using the plotly library? We will show you a demo of what plotly is and why it’s useful, and then you can try converting a static ggplot graph into an interactive plotly graph.

This is a preview of what we’ll be doing in STAT 547 - making dynamic and interactive dashboards using R!

For this demo, make sure you have the following packages installed and loaded:

library(tidyverse)
library(gapminder)
library(plotly) 

Make ggplot2 graphs interactive

It’s very easy to convert an existing ggplot2 graph into an interactive graph with plotly::ggplotly.

On the graph below, explore the interactive options:

  • Hover your cursor over individual points
  • Zoom in and out by dragging across / using the zoom tool
  • Single- and double-click items on the legend to isolate groups of points
  • While zoomed in, use the pan tool to “move” around the plot, Google Maps style!

p <- gapminder %>%
    ggplot(aes(x = gdpPercap, y = lifeExp, color = continent)) +
    geom_point() 
p %>%
    ggplotly()
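
The hover text can also be customised. The sketch below is one possible approach, not part of the original demo: it adds a `text` aesthetic to the ggplot and passes `tooltip = "text"` to `ggplotly()`. The names `p_tip` and `w` are made up for this example.

```r
library(ggplot2)
library(gapminder)
library(plotly)

# The text aesthetic is ignored by ggplot2 itself but picked up by ggplotly()
p_tip <- ggplot(gapminder,
                aes(x = gdpPercap, y = lifeExp, color = continent,
                    text = paste0(country, " (", year, ")"))) +
    geom_point()

# Show only the custom text when hovering over a point
w <- ggplotly(p_tip, tooltip = "text")
w
```

Hovering over a point should now show the country and year instead of the default list of mapped values.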

Make interactive plots with plotly::plot_ly

We can also make interactive graphs using the plotly::plot_ly function:

p <- gapminder %>%
    plot_ly(x = ~gdpPercap,
            y = ~lifeExp,
            color = ~continent,
            
            # mode specifies the geometric object e.g. "markers" for points, "lines" for lines
            mode = 'markers',
            
            # type controls the "type" of graph e.g. 'bar', 'scatter'
            type = 'scatter'
            )
p
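
Axis titles and scales for a `plot_ly` graph can be set with `layout()`. The following is a sketch of the same scatter plot with a log-scaled x-axis and explicit axis titles; the title strings and the name `p_log_axis` are my own choices for illustration.

```r
library(gapminder)
library(plotly)

p_log_axis <- gapminder %>%
    plot_ly(x = ~gdpPercap,
            y = ~lifeExp,
            color = ~continent,
            type = "scatter",
            mode = "markers") %>%
    # layout() modifies plot-level settings such as axes and titles
    layout(xaxis = list(type = "log", title = "GDP per capita (log scale)"),
           yaxis = list(title = "Life expectancy (years)"))

p_log_axis
```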

Share with others

To share with others:

  1. Create a plotly account @ plot.ly
  2. Navigate to settings, and note the following information:
     • your username
     • your API key

Now, we will tell R your account information so that we can upload these plots to the web.

Note that once we run api_create(), the browser will open to a webpage displaying your interactive plot. You can share this page with others, but they will only be able to view it. If you want others to be able to edit the graph, you can invite them to “collaborate” in the “Sharing link” option.

# fill in the below with your information
Sys.setenv("plotly_username"="your_plotly_username")
Sys.setenv("plotly_api_key"="your_api_key")
# upload our plots to the website
api_create(p, filename = 'name-of-your-plot')
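
If you would rather not create an account, one alternative (not covered in the class demo) is to save the plot as a local HTML file with `htmlwidgets::saveWidget()`. This sketch assumes the htmlwidgets package is installed; the output path in `tempdir()` is only a placeholder.

```r
library(gapminder)
library(plotly)
library(htmlwidgets)

p <- plot_ly(gapminder,
             x = ~gdpPercap, y = ~lifeExp, color = ~continent,
             type = "scatter", mode = "markers")

# Placeholder output location; replace with a path of your choice
out_file <- file.path(tempdir(), "name-of-your-plot.html")

# selfcontained = FALSE writes the JavaScript dependencies to a folder next to the HTML file
saveWidget(p, file = out_file, selfcontained = FALSE)
```

To share the result, send the HTML file together with its dependency folder. Setting `selfcontained = TRUE` produces a single file instead, but requires pandoc to be available.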

Supplemental worksheet (Optional)

You are highly encouraged to work through the cm013 supplemental exercises worksheet. It is a great guide that will take you through Scales, Colours, and Themes in ggplot. There is also a short guided activity showing you how to make a ggplot interactive using plotly.